library(tidyverse)  # data wrangling and plotting
library(psych)      # descriptive statistics
library(cowplot)    # plot arrangement helpers
library(patchwork)  # combining ggplots
library(here)       # project-relative file paths
library(brms)       # Bayesian multilevel models via Stan
library(tidybayes)  # tidy posterior draws and summaries
Often, there are opportunities to cluster your observations: repeated measures, group membership, hierarchies, even different measures for the same participant. Whenever you can cluster, you should!
Aggregation

* Between person (H1): Do students who study more get better grades?
* Within person (H2): When a student studies more than usual, do they get better grades?
* H1 and H2 are independent of one another! Aggregation collapses the two. When you have nested data with many DVs, it is important not to aggregate (see the sketch below).
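To see that H1 and H2 can come apart, here’s a minimal simulation (hypothetical data, my sketch): the between-person effect of studying is built to be negative while the within-person effect is positive, and each analysis recovers only its own effect.

set.seed(1)
# hypothetical: 50 students, 10 occasions each
students <- tibble(
  id         = 1:50,
  mean_study = rnorm(50, mean = 5, sd = 2)   # stable study habits
)
d_sim <- students %>%
  crossing(occasion = 1:10) %>%
  mutate(
    study = mean_study + rnorm(n(), 0, 1),   # occasion-level fluctuation
    # between-person effect negative, within-person effect positive
    grade = -0.5 * mean_study + 0.8 * (study - mean_study) + rnorm(n(), 0, 1)
  )

# H1 (between person): correlate the person means
d_sim %>%
  group_by(id) %>%
  summarise(study = mean(study), grade = mean(grade)) %>%
  summarise(r_between = cor(study, grade))

# H2 (within person): correlate the person-centered scores
d_sim %>%
  group_by(id) %>%
  mutate(study_c = study - mean(study), grade_c = grade - mean(grade)) %>%
  ungroup() %>%
  summarise(r_within = cor(study_c, grade_c))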
McElreath doesn’t cover this in his video lecture, but this is from the textbook and worth discussing.
Data from Silk et al. (2005)
From McElreath:
The data for this example come from an experiment aimed at evaluating the prosocial tendencies of chimpanzees (Pan troglodytes). The experimental structure mimics many common experiments conducted on human students (Homo sapiens studiensis) by economists and psychologists. A focal chimpanzee sits at one end of a long table with two levers, one on the left and one on the right in this figure. On the table are four dishes which may contain desirable food items. The two dishes on the right side of the table are attached by a mechanism to the right-hand lever. The two dishes on the left side are similarly attached to the left-hand lever.
When either the left or right lever is pulled by the focal animal, the two dishes on the same side slide towards opposite ends of the table. This delivers whatever is in those dishes to the opposite ends. In all experimental trials, both dishes on the focal animal’s side contain food items. But only one of the dishes on the other side of the table contains a food item. Therefore while both levers deliver food to the focal animal, only one of the levers delivers food to the other side of the table.
There are two experimental conditions. In the partner condition, another chimpanzee is seated at the opposite end of the table, as pictured in the figure. In the control condition, the other side of the table is empty. Finally, two counterbalancing treatments alternate which side, left or right, has a food item for the other side of the table. This helps detect any handedness preferences for individual focal animals.
When human students participate in an experiment like this, they nearly always choose the lever linked to two pieces of food, the prosocial option, but only when another student sits on the opposite side of the table. The motivating question is whether a focal chimpanzee behaves similarly, choosing the prosocial option more often when another animal is present. In terms of linear models, we want to estimate the interaction between condition (presence or absence of another animal) and option (which side is prosocial).
data(chimpanzees, package="rethinking")
d <- chimpanzees
rethinking::precis(d)
## mean sd 5.5% 94.5% histogram
## actor 4.0000000 2.0019871 1.000 7.000 ▇▇▁▇▁▇▁▇▁▇▁▇
## recipient 5.0000000 2.0039801 2.000 8.000 ▇▇▁▇▁▇▁▇▁▇▁▇
## condition 0.5000000 0.5004968 0.000 1.000 ▇▁▁▁▁▁▁▁▁▇
## block 3.5000000 1.7095219 1.000 6.000 ▇▇▁▇▁▇▁▇▁▇
## trial 36.5000000 20.8032533 4.665 68.335 ▇▇▇▇▇▇▇▁
## prosoc_left 0.5000000 0.5004968 0.000 1.000 ▇▁▁▁▁▁▁▁▁▇
## chose_prosoc 0.5674603 0.4959204 0.000 1.000 ▅▁▁▁▁▁▁▁▁▇
## pulled_left 0.5793651 0.4941515 0.000 1.000 ▅▁▁▁▁▁▁▁▁▇
unique(d$actor)
## [1] 1 2 3 4 5 6 7
unique(d$block)
## [1] 1 2 3 4 5 6
unique(d$prosoc_left)
## [1] 0 1
unique(d$condition)
## [1] 0 1
We could model the interaction between condition (presence/absence of another animal) and option (which side is prosocial), but it is more difficult to assign sensible priors to interaction effects. Another option, because we’re working with categorical variables, is to turn our 2x2 design into a single variable with four levels.
# collapse the 2x2 design into a single 4-level factor
d$treatment <- factor(1 + d$prosoc_left + 2 * d$condition)
d %>% count(treatment, prosoc_left, condition)
## treatment prosoc_left condition n
## 1 1 0 0 126
## 2 2 1 0 126
## 3 3 0 1 126
## 4 4 1 1 126
# treatment labels: prosocial side / partner presence
# r/n = right, no partner; l/n = left, no partner; r/p = right, partner; l/p = left, partner
t_labels = c("r/n", "l/n", "r/p", "l/p")
In this experiment, each pull is within a cluster of pulls belonging to an individual chimpanzee. But each pull is also within an experimental block, which represents a collection of observations that happened on the same day. So each observed pull belongs to both an actor (1 to 7) and a block (1 to 6). There may be unique intercepts for each actor as well as for each block.
Mathematical model:
\[\begin{align*} L_i &\sim \text{Binomial}(1, p_i) \\ \text{logit}(p_i) &= \beta_{\text{TREATMENT}[i]} + \alpha_{\text{ACTOR}[i]} + \gamma_{\text{BLOCK}[i]} \\ \beta_j &\sim \text{Normal}(0, 1.5) \text{ , for } j=1..4\\ \alpha_j &\sim \text{Normal}(0, \sigma_{\alpha}) \text{ , for } j=1..7\\ \gamma_j &\sim \text{Normal}(0, \sigma_{\gamma}) \text{ , for } j=1..6\\ \sigma_{\alpha} &\sim \text{Exponential}(1) \\ \sigma_{\gamma} &\sim \text{Exponential}(1) \\ \end{align*}\]
m1 <-
  brm(
    family = bernoulli,
    data = d,
    # 0 + treatment: no global intercept, one effect per treatment level,
    # plus varying intercepts for actor and for block
    pulled_left ~ 0 + treatment + (1 | actor) + (1 | block),
    prior = c(prior(normal(0, 1.5), class = b),
              prior(exponential(1), class = sd, group = actor),
              prior(exponential(1), class = sd, group = block)),
    chains = 4, cores = 4, iter = 2000, warmup = 1000,
    seed = 1,
    file = here("files/models/71.1")
  )
m1
## Family: bernoulli
## Links: mu = logit
## Formula: pulled_left ~ 0 + treatment + (1 | actor) + (1 | block)
## Data: d (Number of observations: 504)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Multilevel Hyperparameters:
## ~actor (Number of levels: 7)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.96 0.63 1.06 3.45 1.00 1250 1660
##
## ~block (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.20 0.17 0.01 0.63 1.00 1351 1690
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## treatment1 0.25 0.57 -0.90 1.38 1.00 1083 1577
## treatment2 0.85 0.57 -0.31 1.94 1.00 1044 1743
## treatment3 -0.16 0.57 -1.33 0.94 1.00 1062 1573
## treatment4 0.72 0.57 -0.46 1.81 1.00 1022 1565
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
posterior_summary(m1)
## Estimate Est.Error Q2.5 Q97.5
## b_treatment1 0.25346086 0.5690543 -8.956685e-01 1.3752420
## b_treatment2 0.85408515 0.5716742 -3.133519e-01 1.9394030
## b_treatment3 -0.16255802 0.5734419 -1.331065e+00 0.9439413
## b_treatment4 0.72272758 0.5735628 -4.574810e-01 1.8120448
## sd_actor__Intercept 1.95765160 0.6271925 1.058292e+00 3.4481233
## sd_block__Intercept 0.20429722 0.1714939 7.619742e-03 0.6298440
## r_actor[1,Intercept] -0.76806254 0.5871299 -1.884859e+00 0.4127745
## r_actor[2,Intercept] 4.16869772 1.2800422 2.187441e+00 7.2504878
## r_actor[3,Intercept] -1.07340338 0.5831362 -2.164617e+00 0.1178102
## r_actor[4,Intercept] -1.06777362 0.5979555 -2.198872e+00 0.1664252
## r_actor[5,Intercept] -0.76831595 0.5940808 -1.901701e+00 0.4779332
## r_actor[6,Intercept] 0.18036694 0.5931503 -9.723677e-01 1.3304202
## r_actor[7,Intercept] 1.71737307 0.6527872 4.933997e-01 3.0887395
## r_block[1,Intercept] -0.15780279 0.2170111 -6.890853e-01 0.1378163
## r_block[2,Intercept] 0.03741492 0.1817171 -3.388320e-01 0.4520013
## r_block[3,Intercept] 0.05159167 0.1837081 -2.778782e-01 0.4987016
## r_block[4,Intercept] 0.01488767 0.1817779 -3.471849e-01 0.4117812
## r_block[5,Intercept] -0.02724225 0.1824159 -4.304912e-01 0.3526096
## r_block[6,Intercept] 0.10603610 0.1960600 -2.018372e-01 0.5857896
## lprior -8.04858139 0.8743820 -1.025541e+01 -6.8249218
## lp__ -288.90426240 3.8659615 -2.973491e+02 -282.5082745
m1 %>%
mcmc_plot(variable = c("^r_", "^b_", "^sd_"), regex = T) +
theme(axis.text.y = element_text(hjust = 0))
Zooming in on just the actor and block effects. (Remember, these r_ parameters are deviations from the population-level effects, not the actor- or block-specific estimates themselves.)
m1 %>%
mcmc_plot(variable = c("^r_"), regex = T) +
theme(axis.text.y = element_text(hjust = 0))
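Because these deviations live on the log-odds scale, it can help to turn them into something directly interpretable. A sketch (using actor 2 and treatment 1 as an arbitrary example): add the deviation to the relevant population-level effect and push the sum through the inverse logit.

as_draws_df(m1) %>%
  # p(pull left) for actor 2 in treatment 1 (r/n): population effect + actor offset
  mutate(p_actor2_t1 = inv_logit_scaled(b_treatment1 + `r_actor[2,Intercept]`)) %>%
  mean_qi(p_actor2_t1)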
as_draws_df(m1) %>%
select(starts_with("sd")) %>%
pivot_longer(everything()) %>%
ggplot(aes(x = value, fill = name)) +
geom_density(linewidth = 0, alpha = 3/4, adjust = 2/3, show.legend = F) +
annotate(geom = "text", x = 0.67, y = 2, label = "block", color = "#5e8485") +
annotate(geom = "text", x = 2.725, y = 0.5, label = "actor", color = "#0f393a") +
scale_fill_manual(values = c("#0f393a", "#5e8485")) +
scale_y_continuous(NULL, breaks = NULL) +
ggtitle(expression(sigma["group"])) +
coord_cartesian(xlim = c(0, 4))
## Warning: Dropping 'draws_df' class as required metadata was removed.
Let’s look at the predictions by actor and by block to confirm that the variation across actors dwarfs the variation across blocks.
d %>% distinct(actor, block, treatment) %>%
add_epred_draws(m1) %>%
mutate(treatment=factor(treatment,
levels=as.character(1:4),
labels=t_labels)) %>%
group_by(actor, treatment) %>%
mean_qi(.epred) %>%
ggplot( aes(x=treatment, y=.epred, group=1) ) +
geom_point() +
geom_line() +
labs(x=NULL, y="p(pull left)", title="by actor") +
facet_wrap(~actor)
d %>% distinct(actor, block, treatment) %>%
add_epred_draws(m1) %>%
mutate(treatment=factor(treatment,
levels=as.character(1:4),
labels=t_labels)) %>%
group_by(block, treatment) %>%
mean_qi(.epred) %>%
ggplot( aes(x=treatment, y=.epred, group=1) ) +
geom_point() +
geom_line() +
labs(x=NULL, y="p(pull left)", title="by block") +
facet_wrap(~block)
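The quantity of real interest is the partner effect: the contrast between the no-partner and partner treatments on each prosocial side. A sketch of those contrasts on the log-odds scale (labels follow t_labels above):

as_draws_df(m1) %>%
  mutate(
    `r/n - r/p` = b_treatment1 - b_treatment3,  # prosocial on right
    `l/n - l/p` = b_treatment2 - b_treatment4   # prosocial on left
  ) %>%
  pivot_longer(c(`r/n - r/p`, `l/n - l/p`)) %>%
  group_by(name) %>%
  mean_qi(value)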
Now we turn to varying intercepts and slopes. Let’s start by simulating cafe data.
# ---- set population-level parameters -----
a <- 3.5 # average morning wait time
b <- (-1) # average difference afternoon wait time
sigma_a <- 1 # std dev in intercepts
sigma_b <- 0.5 # std dev in slopes
rho <- (-0.7) # correlation between intercepts and slopes
# ---- create vector of means ----
Mu <- c(a, b)
# ---- create matrix of variances and covariances ----
sigmas <- c(sigma_a,sigma_b) # standard deviations
Rho <- matrix( c(1,rho,rho,1) , nrow=2 ) # correlation matrix
# now matrix multiply to get covariance matrix
Sigma <- diag(sigmas) %*% Rho %*% diag(sigmas)
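# quick sanity check (my addition): Sigma's diagonal holds the variances (1, 0.25)
# and the off-diagonal holds the covariance rho*sigma_a*sigma_b = -0.35;
# cov2cor() should recover the correlation matrix Rho
Sigma
cov2cor(Sigma)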
# ---- simulate intercepts and slopes -----
N_cafes = 20
library(MASS)
##
## Attaching package: 'MASS'
## The following object is masked from 'package:patchwork':
##
## area
## The following object is masked from 'package:dplyr':
##
## select
set.seed(5)
vary_effects <- mvrnorm( n=N_cafes, mu = Mu, Sigma=Sigma)
a_cafe <- vary_effects[, 1]
b_cafe <- vary_effects[, 2]
# ---- simulate observations -----
set.seed(22)
N_visits <- 10
afternoon <- rep(0:1,N_visits*N_cafes/2)
cafe_id <- rep( 1:N_cafes , each=N_visits )
mu <- a_cafe[cafe_id] + b_cafe[cafe_id]*afternoon
sigma <- 0.5 # std dev within cafes
wait <- rnorm( N_visits*N_cafes , mu , sigma )
d <- data.frame( cafe=cafe_id , afternoon=afternoon , wait=wait )
rethinking::precis(d)
## mean sd 5.5% 94.5% histogram
## cafe 10.500000 5.7807513 2.000000 19.000000 ▇▇▇▇▇▇▇▇▇▇
## afternoon 0.500000 0.5012547 0.000000 1.000000 ▇▁▁▁▁▁▁▁▁▇
## wait 3.073405 1.1082252 1.477719 4.869957 ▁▁▂▅▇▇▅▇▃▃▁▁▁▁
d %>% filter(cafe==1)
## cafe afternoon wait
## 1 1 0 3.967893
## 2 1 1 3.857198
## 3 1 0 4.727875
## 4 1 1 2.761013
## 5 1 0 4.119483
## 6 1 1 3.543652
## 7 1 0 4.190949
## 8 1 1 2.533223
## 9 1 0 4.124032
## 10 1 1 2.764887
In this exercise, we are simulating data from a generative process and then analyzing those data with a model that reflects exactly the correct structure of that process. But in the real world, we’re never so lucky. Instead we are always forced to analyze data with a model that is MISSPECIFIED: the true data-generating process is different from the model. However, simulation can be used to explore misspecification. Just simulate data from a process and then see how a number of models, none of which match the data-generating process exactly, perform. And always remember that Bayesian inference does not depend upon data-generating assumptions, such as the likelihood, being true. Non-Bayesian approaches may depend upon sampling distributions for their inferences, but this is not the case for a Bayesian model. In a Bayesian model, a likelihood is a prior for the data, and inference about parameters can be surprisingly insensitive to its details.
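To make that concrete, a toy sketch (mine, not from the textbook): generate right-skewed waits from a lognormal process, fit a deliberately misspecified Normal model, and the posterior for the mean should still land near the true average.

set.seed(14)
# true process: lognormal waits (right-skewed), so a Normal likelihood is wrong
true_mean <- exp(1 + 0.5^2 / 2)  # mean of lognormal(meanlog = 1, sdlog = 0.5), ~3.08
d_wrong   <- data.frame(wait = rlnorm(200, meanlog = 1, sdlog = 0.5))

m_wrong <- brm(
  data = d_wrong,
  family = gaussian,
  wait ~ 1,
  prior = prior(normal(3, 2), class = Intercept),
  chains = 4, cores = 4, seed = 14
)
fixef(m_wrong)  # the Intercept posterior should sit near true_mean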
Mathematical model:
Likelihood function and linear model
\[\begin{align*} W_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha_{CAFE[i]} + \beta_{CAFE[i]}(\text{Afternoon}_i) \end{align*}\]
Varying intercepts and slopes
\[\begin{align*} \begin{bmatrix} \alpha_{\text{CAFE}[i]} \\ \beta_{\text{CAFE}[i]} \end{bmatrix} &\sim \text{MVNormal}\left( \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \mathbf{S}\right) \\ \mathbf{S} &= \begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta}\end{pmatrix}\mathbf{R}\begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta}\end{pmatrix} \\ \end{align*}\]
Priors
\[\begin{align*} \alpha &\sim \text{Normal}(5,2) \\ \beta &\sim \text{Normal}(-1,0.5) \\ \sigma &\sim \text{Exponential}(1) \\ \sigma_{\alpha} &\sim \text{Exponential}(1) \\ \sigma_{\beta} &\sim \text{Exponential}(1) \\ \mathbf{R} &\sim \text{LKJcorr}(2) \end{align*}\]
# sample correlations implied by LKJcorr priors with increasing eta
# (for a 2x2 matrix, eta = 1 is flat over [-1, 1]; larger eta concentrates near 0)
rlkj_1 = rethinking::rlkjcorr(1e4, K=2, eta=1)
rlkj_2 = rethinking::rlkjcorr(1e4, K=2, eta=2)
rlkj_4 = rethinking::rlkjcorr(1e4, K=2, eta=4)
rlkj_6 = rethinking::rlkjcorr(1e4, K=2, eta=6)
data.frame(rlkj_1= rlkj_1[,1,2],
rlkj_2= rlkj_2[,1,2],
rlkj_4= rlkj_4[,1,2],
rlkj_6= rlkj_6[,1,2]) %>%
ggplot() +
geom_density(aes(x=rlkj_1, color = "1"), alpha=.3) +
geom_density(aes(x=rlkj_2, color = "2"), alpha=.3) +
geom_density(aes(x=rlkj_4, color = "4"), alpha=.3) +
geom_density(aes(x=rlkj_6, color = "6"), alpha=.3) +
labs(x="R", color="eta") +
theme(legend.position = "top")
m2 <- brm(
data = d,
family = gaussian,
wait ~ 1 + afternoon + (1 + afternoon | cafe),
prior = c(
prior( normal(5,2), class=Intercept ),
prior( normal(-1, .5), class=b),
prior( exponential(1), class=sd),
prior( exponential(1), class=sigma),
prior( lkj(2), class=cor)
),
iter=2000, warmup=1000, chains=4, cores=4, seed=9,
file=here("files/models/71.2")
)
posterior_summary(m2) %>% round(2)
## Estimate Est.Error Q2.5 Q97.5
## b_Intercept 3.66 0.23 3.21 4.11
## b_afternoon -1.13 0.14 -1.41 -0.85
## sd_cafe__Intercept 0.97 0.17 0.71 1.37
## sd_cafe__afternoon 0.59 0.12 0.39 0.87
## cor_cafe__Intercept__afternoon -0.51 0.18 -0.80 -0.10
## sigma 0.47 0.03 0.42 0.53
## Intercept 3.10 0.20 2.71 3.50
## r_cafe[1,Intercept] 0.55 0.29 -0.01 1.14
## r_cafe[2,Intercept] -1.50 0.30 -2.10 -0.90
## r_cafe[3,Intercept] 0.70 0.30 0.12 1.30
## r_cafe[4,Intercept] -0.42 0.29 -0.98 0.15
## r_cafe[5,Intercept] -1.79 0.30 -2.36 -1.20
## r_cafe[6,Intercept] 0.60 0.29 0.02 1.18
## r_cafe[7,Intercept] -0.05 0.29 -0.64 0.55
## r_cafe[8,Intercept] 0.28 0.30 -0.31 0.86
## r_cafe[9,Intercept] 0.32 0.29 -0.28 0.90
## r_cafe[10,Intercept] -0.10 0.29 -0.68 0.49
## r_cafe[11,Intercept] -1.74 0.29 -2.31 -1.17
## r_cafe[12,Intercept] 0.18 0.29 -0.40 0.76
## r_cafe[13,Intercept] 0.22 0.30 -0.37 0.82
## r_cafe[14,Intercept] -0.49 0.30 -1.08 0.09
## r_cafe[15,Intercept] 0.79 0.30 0.22 1.39
## r_cafe[16,Intercept] -0.27 0.29 -0.86 0.32
## r_cafe[17,Intercept] 0.55 0.30 -0.03 1.15
## r_cafe[18,Intercept] 2.08 0.30 1.52 2.68
## r_cafe[19,Intercept] -0.42 0.30 -1.00 0.16
## r_cafe[20,Intercept] 0.07 0.30 -0.53 0.65
## r_cafe[1,afternoon] -0.03 0.29 -0.60 0.53
## r_cafe[2,afternoon] 0.23 0.30 -0.36 0.79
## r_cafe[3,afternoon] -0.80 0.30 -1.39 -0.24
## r_cafe[4,afternoon] -0.10 0.28 -0.65 0.43
## r_cafe[5,afternoon] 1.00 0.30 0.42 1.57
## r_cafe[6,afternoon] -0.16 0.28 -0.72 0.38
## r_cafe[7,afternoon] 0.10 0.28 -0.46 0.64
## r_cafe[8,afternoon] -0.50 0.29 -1.09 0.07
## r_cafe[9,afternoon] -0.17 0.28 -0.73 0.38
## r_cafe[10,afternoon] 0.18 0.29 -0.39 0.75
## r_cafe[11,afternoon] 0.70 0.29 0.15 1.27
## r_cafe[12,afternoon] -0.06 0.29 -0.63 0.50
## r_cafe[13,afternoon] -0.68 0.29 -1.25 -0.12
## r_cafe[14,afternoon] 0.19 0.29 -0.36 0.78
## r_cafe[15,afternoon] -1.06 0.31 -1.65 -0.44
## r_cafe[16,afternoon] 0.09 0.28 -0.47 0.64
## r_cafe[17,afternoon] -0.09 0.28 -0.64 0.47
## r_cafe[18,afternoon] 0.10 0.30 -0.49 0.70
## r_cafe[19,afternoon] 0.87 0.30 0.29 1.47
## r_cafe[20,afternoon] 0.07 0.29 -0.49 0.66
## lprior -5.06 0.44 -6.07 -4.36
## lp__ -197.20 7.16 -212.07 -184.27
Let’s get the slopes and intercepts for each cafe.
# cafe-specific estimates (population-level effect + cafe-level deviation)
intercepts = coef(m2)$cafe[, , "Intercept"]
slopes     = coef(m2)$cafe[, , "afternoon"]
cafe_params = data.frame(
  cafe       = 1:20,
  intercepts = intercepts[, 1],  # "Estimate" column
  slopes     = slopes[, 1]
)
cafe_params %>% round(3)
## cafe intercepts slopes
## 1 1 4.215 -1.156
## 2 2 2.158 -0.904
## 3 3 4.367 -1.928
## 4 4 3.243 -1.232
## 5 5 1.875 -0.135
## 6 6 4.260 -1.294
## 7 7 3.617 -1.027
## 8 8 3.945 -1.633
## 9 9 3.979 -1.305
## 10 10 3.563 -0.953
## 11 11 1.927 -0.430
## 12 12 3.841 -1.186
## 13 13 3.881 -1.809
## 14 14 3.175 -0.938
## 15 15 4.455 -2.192
## 16 16 3.390 -1.040
## 17 17 4.217 -1.219
## 18 18 5.748 -1.028
## 19 19 3.247 -0.260
## 20 20 3.729 -1.056
cafe_params %>%
ggplot( aes(x=intercepts, y=slopes) ) +
geom_point(size = 2)
cafe_params %>%
ggplot( aes(x=intercepts, y=slopes) ) +
stat_ellipse() +
geom_point(size = 2)
cafe_params %>%
ggplot( aes(x=intercepts, y=slopes) ) +
mapply(function(level) {
stat_ellipse(geom = "polygon", type = "norm",
linewidth = 0, alpha = .1, fill = "#1c5253",
level = level)
},
# enter the levels here
level = c(1:9 / 10, .99)) +
geom_point(size = 2)
More about stat_ellipse in the ggplot2 documentation (?ggplot2::stat_ellipse).
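These partially pooled estimates are shrunk toward the population mean. To see the shrinkage directly, a sketch (my addition, assuming the simulated d and cafe_params from above are still in scope) comparing the raw no-pooling estimates for each cafe with the model’s partially pooled estimates:

# no-pooling estimates: per-cafe sample means
unpooled <- d %>%
  group_by(cafe) %>%
  summarise(
    intercepts = mean(wait[afternoon == 0]),
    slopes     = mean(wait[afternoon == 1]) - mean(wait[afternoon == 0])
  ) %>%
  mutate(pooling = "none")

cafe_params %>%
  mutate(pooling = "partial") %>%
  bind_rows(unpooled) %>%
  ggplot(aes(x = intercepts, y = slopes, color = pooling, group = cafe)) +
  geom_line(color = "grey70") +  # connect each cafe's pair of estimates
  geom_point() +
  labs(title = "Shrinkage: no-pooling vs. partially pooled estimates")

Cafes with extreme raw estimates should be pulled the furthest toward the center.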
Now use the slopes and intercepts to calculate the expected morning and afternoon wait times for each cafe. Plot these as a scatterplot. Bonus for ellipses.
cafe_params %>%
mutate(
morning = intercepts,
afternoon = intercepts + slopes
) %>%
ggplot( aes(x=morning, y=afternoon) ) +
mapply(function(level) {
stat_ellipse(geom = "polygon", type = "norm",
linewidth = 0, alpha = .1, fill = "#1c5253",
level = level)
},
# enter the levels here
level = c(1:9 / 10, .99)) +
geom_point(size = 2)+
labs(x="morning wait time",
y="afternoon wait time")
What is the correlation between our intercepts and slopes?
post = as_draws_df(m2)
rlkj_2 = rethinking::rlkjcorr(nrow(post), K=2, eta=2)
data.frame(prior= rlkj_2[,1,2],
posterior = post$cor_cafe__Intercept__afternoon) %>%
ggplot() +
geom_density(aes(x=prior, color = "prior"), alpha=.3) +
geom_density(aes(x=posterior, color = "posterior"), alpha=.3) +
scale_color_manual(values = c("#1c5253" , "#e07a5f")) +
labs(x="R") +
theme(legend.position = "top")
Returning to our personality data from Tuesday.
data_path = "https://raw.githubusercontent.com/sjweston/uobayes/refs/heads/main/files/data/external_data/mlm.csv"
d <- read.csv(data_path)
d %>% count(wave)
## wave n
## 1 1 91
## 2 2 88
## 3 3 35
## 4 4 5
d %>% count(group)
## group n
## 1 CTRL 58
## 2 PD 161
d %>%
ggplot(aes(x=week, y=con, group=id)) +
geom_line(alpha=.3) +
facet_wrap(~group)
## Warning: Removed 3 rows containing missing values or values outside the scale range
## (`geom_line()`).
week is a within-person variable (Level 1), whereas group is a between-person variable (Level 2).
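A quick structural check (my addition): week should take multiple values within each id, while group should be constant within each id.

d %>%
  group_by(id) %>%
  summarise(
    n_weeks  = n_distinct(week),   # varies within person (Level 1)
    n_groups = n_distinct(group)   # constant within person (Level 2)
  ) %>%
  summarise(max_weeks = max(n_weeks), max_groups = max(n_groups))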
Mathematical model with level-2 covariate
Likelihood function and linear model
\[\begin{align*} \text{con}_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha_{\text{id}[i]} + \beta_{\text{id}[i]}t_i \end{align*}\]
Varying intercepts and slopes with level-2 predictor:
\[\begin{align*} \alpha_{\text{id}[i]} &= \alpha + \gamma_\alpha G_{\text{id}[i]} + u_{\alpha,\text{id}[i]} \\ \beta_{\text{id}[i]} &= \beta + \gamma_\beta G_{\text{id}[i]} + u_{\beta,\text{id}[i]} \end{align*}\]
Random effects:
\[\begin{align*} \begin{bmatrix} u_{\alpha,\text{id}[i]} \\ u_{\beta,\text{id}[i]} \end{bmatrix} &\sim \text{MVNormal}\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \mathbf{S}\right) \\ \mathbf{S} &= \begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta}\end{pmatrix}\mathbf{R}\begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta}\end{pmatrix} \end{align*}\]
Priors: \[\begin{align*} \alpha &\sim \text{Normal}(0,1) \\ \beta &\sim \text{Normal}(0,1) \\ \gamma_\alpha &\sim \text{Normal}(0,1) \\ \gamma_\beta &\sim \text{Normal}(0,1) \\ \sigma &\sim \text{Exponential}(1) \\ \sigma_{\alpha} &\sim \text{Exponential}(1) \\ \sigma_{\beta} &\sim \text{Exponential}(1) \\ \mathbf{R} &\sim \text{LKJcorr}(2) \end{align*}\]
m3 <- brm(
data=d,
family=gaussian,
con ~ 1 + week + group + week:group + (1 + week | id),
prior = c( prior( normal(0,1), class=Intercept ),
prior( normal(0,1), class=b ),
prior( exponential(1), class=sigma ),
prior( exponential(1), class=sd ),
prior( lkj(2), class=cor)),
iter=4000, warmup=1000, seed=9, cores=4, chains=4,
file=here("files/models/71.3")
)
Divergent transitions! What do we do?
m3
## Warning: There were 17 divergent transitions after warmup. Increasing
## adapt_delta above 0.9 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: con ~ 1 + week + group + week:group + (1 + week | id)
## Data: d (Number of observations: 216)
## Draws: 4 chains, each with iter = 4000; warmup = 1000; thin = 1;
## total post-warmup draws = 12000
##
## Multilevel Hyperparameters:
## ~id (Number of levels: 88)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.06 0.01 0.05 0.07 1.00 4017 6305
## sd(week) 0.00 0.00 0.00 0.01 1.00 1909 2432
## cor(Intercept,week) -0.08 0.39 -0.75 0.72 1.00 12802 7985
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 0.20 0.01 0.17 0.23 1.00 5915 7919
## week 0.00 0.00 -0.01 0.01 1.00 10943 7070
## groupPD -0.01 0.02 -0.04 0.02 1.00 6071 7209
## week:groupPD -0.01 0.01 -0.02 0.01 1.00 11028 6548
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.05 0.00 0.04 0.05 1.00 3847 4795
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
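One option, as the warning suggests, is to make the sampler take smaller, more careful steps. A sketch (refitting is slow, and the result is not shown here):

# hypothetical refit with a stricter adapt_delta (smaller step sizes)
m3_careful <- update(m3, control = list(adapt_delta = 0.99))

But divergences often signal a geometry problem rather than a tuning problem. The strong posterior correlation between the intercept and the group effect, plotted below, suggests reparameterizing, e.g. centering our predictors, as the more durable fix.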
as_draws_df(m3) %>%
ggplot( aes(x=b_Intercept, y=b_groupPD) ) +
geom_point()
Center your variables:
d = d %>%
  mutate(
    across( c(con, week),
            ~ . - mean(., na.rm = TRUE),  # grand-mean center
            .names = "{col}_c"))
m3b <- brm(
data=d,
family=gaussian,
con_c ~ 1 + week_c + group + week_c:group + (1 + week_c | id),
prior = c( prior( normal(0,1), class=Intercept ),
prior( normal(0,1), class=b ),
prior( exponential(1), class=sigma ),
prior( exponential(1), class=sd ),
prior( lkj(2), class=cor)),
iter=4000, warmup=1000, seed=9, cores=4, chains=4,
file=here("files/models/71.3b")
)
# uncentered model: slope SD vs. intercept-slope correlation
p1 = as_draws_df(m3) %>%
  ggplot( aes(x=sd_id__week,
              y=cor_id__Intercept__week) ) +
  geom_point()

# centered model, for comparison
p2 = as_draws_df(m3b) %>%
  ggplot( aes(x=sd_id__week_c, y=cor_id__Intercept__week_c) ) +
  geom_point()
p1 + p2
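To confirm the centered model sampled more cleanly, count the post-warmup divergent transitions in each fit (a sketch using nuts_params from bayesplot, which brms re-exports):

# number of divergent post-warmup iterations per model
nuts_params(m3)  %>% filter(Parameter == "divergent__") %>% summarise(n_divergent = sum(Value))
nuts_params(m3b) %>% filter(Parameter == "divergent__") %>% summarise(n_divergent = sum(Value))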